Different states of stemness of glioblastoma stem cells sustain glioblastoma subtypes indicating novel clinical biomarkers and high-efficacy customized therapies

Background Glioblastoma (GBM) is the most malignant among gliomas with an inevitable lethal outcome. The elucidation of the physiology and regulation of this tumor is mandatory to unravel novel target and effective therapeutics. Emerging concepts show that the minor subset of glioblastoma stem cells (GSCs) accounts for tumorigenicity, representing the true target for innovative therapies in GBM. Methods Here, we isolated and established functionally stable and steadily expanding GSCs lines from a large cohort of GBM patients. The molecular, functional and antigenic landscape of GBM tissues and their derivative GSCs was highlited in a side-by-side comprehensive genomic and transcriptomic characterization by ANOVA and Fisher’s exact tests. GSCs’ physio-pathological hallmarks were delineated by comparing over time in vitro and in vivo their expansion, self-renewal and tumorigenic ability with hierarchical linear models for repeated measurements and Kaplan–Meier method. Candidate biomarkers performance in discriminating GBM patients’ classification emerged by classification tree and patients’ survival analysis. Results Here, distinct biomarker signatures together with aberrant functional programs were shown to stratify GBM patients as well as their sibling GSCs population into TCGA clusters. Of importance, GSCs cells were demonstrated to fully resemble over time the molecular features of their patient of origin. Furthermore, we pointed out the existence of distinct GSCs subsets within GBM classification, inherently endowed with different self-renewal and tumorigenic potential. Particularly, classical GSCs were identified by more undifferentiated biological hallmarks, enhanced expansion and clonal capacity as compared to the more mature, relatively slow-propagating mesenchymal and proneural cells, likely endowed with a higher potential for infiltration either ex vivo or in vivo. Importantly, the combination of DCX and EGFR markers, selectively enriched among GSCs pools, almost exactly predicted GBM patients’ clusters together with their survival and drug response. Conclusions In this study we report that an inherent enrichment of distinct GSCs pools underpin the functional inter-cluster variances displayed by GBM patients. We uncover two selectively represented novel functional biomarkers capable of discriminating GBM patients’ stratification, survival and drug response, setting the stage for the determination of patient-tailored diagnostic and prognostic strategies and, mostly, for the design of appropriate, patient-selective treatment protocols. Supplementary Information The online version contains supplementary material available at 10.1186/s13046-023-02811-0.


Invasion assays
Invasion assays were performed in 6-well Transwell chambers (Corning) (1).The upper side of the filters was washed with Acetic Acid and then coated with Cultrex (Trevigen).2,5 x 10 5 cells were seeded onto the layer coated with Cultrex in the NeuroCult NS-A medium (Stem Cell Technologies).Ten to twelve days after plating, cells were fixed and stained using Hemacolor Rapid Staining Kit (Merck-Millipore).Cells on the upper side of the filters were mechanically removed, and those migrated onto the lower side were counted by microscope (2)(3)(4).

Nucleic acid purification and quantification
Genomic DNA was extracted using Blood and Cell Culture DNA Midi Kit (Qiagen) following manufacturer's protocol.To evaluate the quality and purity of the extracted DNA we used both the absorbance A260/280 and gel electrophoresis.Total RNA from fresh post-surgery tissues and GSCs cells was extracted using RNeasy Mini kit (Zymo Research) and quantified by the NanoDrop ND-2000 (Thermo Scientific) and the RNA integrity was assessed using Agilent Bioanalyzer 2100 (Agilent Technologies) (2)(3)(4)(5).

Targeted sequencing and mutation calling analysis
To get insight into the genomic events differentiating the subtypes, 31 high-grade glioma tissues, matched blood DNA (available from 27 patients) and their 32 derivative GSCs lines were sequenced by Illumina MiSeq Platform (Illumina) (3)(4)(5).Protein-coding regions of the following 27 genes typically altered in GBM and/or key regulators in stemness and differentiation were examined (6-8): ATRX, KRAS, EPHA2, NRAS, IDH1, CTNNB1, PIK3CA, WNT5A, PDGFRA, PIK3R1, TERT, BRAF, EGFR, MET, CDKN2A, BMP4, BMPR1A, BMPR1B, BMPR2, MGMT, PTEN, CCND2, CDK4, RB1, ERBB2, NF1, TP53.Illumina Design Studio was employed to design a Truseq Custom Amplicon Kit and a AmpliSeq kit (Illumina) and genomic DNA from GBM primary tissues, their matched peripheral blood and derivative GSCs lines was used to prepare libraries sequenced by Illumina MiSeq platform.After sequencing, the raw data were processed using the MiSeq Reporter software (Illumina Inc) to be demultiplexed and the resulting sequences were written in FASTQ format.GATK and VarScan were used to identify germline and somatic single nucleotide variants (SNVs) and small insertions and deletions (indels) in GBM patient's tissues with their matched blood as in (4,9,10).For the samples lacking matched blood DNA, in order to exclude germline variants, nucleotide variants were identified using GATK HaplotypeCaller, which was calibrated with a set of normal samples as a surrogate for the missing bloods.AnnoVar algorithm (11) was utilized to annotate somatic variants, which combined information from several genomic and protein databases (GENECODE, UniProt, dbNSFP) along with variant databases relevant to cancer (COSMIC, ClinVar) and non-cancer (dbSNP, 1000 Genomes, Kaviar, Haplotype Reference Consortium, Exome Aggregation Consortium, NHLBI Exome Variant Server).The variants with direct impacts on the protein sequence (missense, truncating, stoploss, splicing variants, frameshift, and in-frame indels) were selected.Variants present in non-cancer databases with a minor allele frequency ≥ 0.05 were categorized as germline polymorphisms and subsequently removed.Multiple prediction algorithms (MutationTaster2, Polyphen2, Provean, and SIFT, FATHMM-Indel, Provean, SIFT-Indel, and VEST-Indel) were utilized to determine the functional effect of missense SNVs and in-frame indels and their pathogenicity (12)(13)(14)(15).All somatic variants classified as pathogenic mutations were finally validated by Sanger sequencing using the ABI Prism BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) and the results analyzed as previously described in (4,5).All primer used are listed in the Supplementary Table 2.
Delta Ct method was used to determine the gene expression profile by quantifying mRNA levels relative to a control sample.The relative expression level is depicted in box and whiskers plot that displays the median (line inside the box), first quartile (Q1) and third quartile (Q3).
The whiskers represent the minimum and maximum value.

DNA copy number analysis
Somatic CNVs of all samples was examined by SNP array by the CytoScanHD array platform (Affymetrix) according to manufacturer's protocol and data analyzed with Partek Genomics Suite 7.0 as previously described (3,16).Genomic regions recurrently amplified or deleted regions were recognized and associated to each patient across subtypes with GISTIC2.0 (17).
Data were graphically depicted as circos plots.

Microarray analysis of gene expression
Gene expression profiling of GBM tissues (n=58) and of GSCs lines (n=27) was performed with total RNA by the Affymetrix GeneChip® Human Transcriptome Array 2.0 (Affymetrix) according to the manufacturer's instructions.Quality control steps and analysis of .CEL files was performed using R (ver.3.6.0)and differentially expressed genes were then examined with Partek Genomics Suite package ver.7.0 (3)(4)(5)18).The Benjamini-Hochberg false discovery rate (FDR) was employed to correct the P-values (q-values).Only genes with a q-value<0.05were considered differentially enriched.The most significantly enriched biological functions and diseases were identified by Ingenuity Pathway Analysis (IPA; Qiagen, http://www.ingenuity.com/) and R software (3)(4)(5) and graphed by bubble plot.
IDH1, EGFRvIII and TERTp mutational status.The most common mutations in IDH1 and the promoter of TERT gene were determined in all samples using condition as in (4,19).For analysis of EGFRvIII status, complementary DNA (cDNA) was synthesized used a SuperScript™ III Reverse Transcriptase (Thermo Fisher) following the manufactory protocol, and PCR was performed as described in (20).The PCR products for EGFR (1044bp) and EGFRvIII (243bp) were visualized using QIAxcel (an automated capillary electrophoresis system by Qiagen).Long-range PCR using 200 ng of genomic DNA was performed as describe in (21) and PCR products were separated and visualized by 0.8% agarose gel electrophoresis.